PASSIVE SMOKING AND LUNG CANCER: REANALYSES OF
HIRAYAMA'S DATA

ABSTRACT

The cohort investigated by HIRAYAMA has a severe selection bias by age. We
investigate the effect of removing this selection bias by iterative
proportional fitting of a contingency table to given marginals. The risk
increase reported by HIRAYAMA disappears completely when the selection bias by
age is removed. Had the cases been observed as they occur in the female
population, no risk increase would have been observed. Only in the subgroup of
women married to industry workers does a risk increase remain, which might
be due to confounding factors. Assuming modest differential misclassification
also leads to risk ratios around unity.


INTRODUCTION 

The statistical association between environmental tobacco smoke and lung cancer
is controversial. The HIRAYAMA study seems to provide sound epidemiological
evidence supporting this hypothesis. In a recent paper UEBERLA (6) has analysed
the published studies. Regarding the HIRAYAMA study the following facts have to
be kept in mind:

- The study was not designed to test the hypothesis whether passive smoking
  is associated with lung cancer or not. It can therefore only generate this
  hypothesis, not prove it.

- The cohort was not representative of the population of Japan. A selection
  bias is possible.

- The exposure indicator - the fact of being married to a man who smokes -
  is not reliable, not valid and not specific.

- The event indicator - dying of lung cancer as noted on death certificates -
  is neither reliable nor valid.

- Various confounding factors - for instance exposure at the working place,
  indoor air pollution, overall air pollution, type of medical care - were
  not accounted for.

- Bias in registering the fact that a woman is a nonsmoker was not
  controlled. Resulting differential misclassifications of the cases who were
  smokers and had to be excluded have not been considered.

- Almost nothing is known about the 200 cases. No case reports are available,
  autopsy and histology are only available in 113 %.

The core of the information on which the results of this study rely is

1.) that during 1965 200 women in Japan told an interviewer on a single
    occasion that they were - at that time - non-smokers and their
    husbands said that they were smokers, which might have been different
    before and afterwards, and

2.) that their death certificates subsequently contained the diagnosis lung
    cancer, which might have been erroneous.

Such sparse information does not seem to be convincing.

In our paper we consider four questions:

1.) What is the relative risk when one removes the selection bias regarding
    age of women in the HIRAYAMA cohort?

2.) What is the relative risk when one additionally accounts for the fact
    that women above 70 who are married to husbands still living are less
    frequent than reported in the population statistics?

3.) What is the relative risk for women married to men with different
    occupations, when one removes the selection bias regarding age of men?

4.) What is the relative risk when additionally some modest differential
    misclassification is assumed?

MATERIALS AND METHODS 

We start from tables 1, 2 and 3 of HIRAYAMA 1984 (4). These tables contain the
most detailed published data. In order to check our program, we reproduced some
of the reported relative risks with good accuracy.


AGE GROUP        PERCENT FEMALE
                 JAPAN POPULATION        HIRAYAMA COHORT


TABLE 1: Differences between the HIRAYAMA cohort and the female age
distribution over 40 in the population of Japan 1965 (Population census 1965.
Statistical survey of the economy of Japan 1967, Ministry of Foreign Affairs of
Japan).

There are marked differences between the HIRAYAMA cohort and the female age
distribution over 40 in the population of Japan 1965. Women 50-59 are
overrepresented, women older than 70 are severely underrepresented. In this age
group only one percent was observed instead of 12 percent in the population.
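
The size of this discrepancy can be checked from the row totals of women under
risk published in tables 2 and 3 further below. The following is only a small
sketch: the dictionary and function names are our own, and the second set of
totals is taken from the age-adjusted table 3, whose row totals follow the
population age distribution.

```python
# Row totals (women under risk) by age of wife.
# Table 2: cohort as observed. Table 3: adjusted to the population.
cohort_totals = {"40-49": 38025, "50-59": 32089, "60-69": 20344, "70+": 1082}
adjusted_totals = {"40-49": 35700.6, "50-59": 27462.0,
                   "60-69": 17392.6, "70+": 10984.8}


def percent_shares(totals):
    """Return each age group's share of the grand total, in percent."""
    grand_total = sum(totals.values())
    return {age: 100.0 * count / grand_total for age, count in totals.items()}


cohort_pct = percent_shares(cohort_totals)
population_pct = percent_shares(adjusted_totals)

for age in cohort_totals:
    print(f"{age:6s} cohort {cohort_pct[age]:5.1f} %   "
          f"population {population_pct[age]:5.1f} %")
```

The 70+ line shows roughly one percent in the cohort against 12 percent in the
population, the gap discussed in the text.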

The investigated cohort certainly has a severe selection bias by age, which
needs no statistical test. This is likely due to the fact that the smoking
behaviour was not known in the elderly or that the husbands of older women
had died. Since it takes twenty years and more from exposure to lung cancer,
older women surely are relevant and should not be excluded. The majority of
lung cancer cases occur in older age groups; in Germany, more than 67 % of
cases in women occur over 65 years.

In order to answer the question what the relative risk is when the age
selection bias is removed, we adjusted the data to the age distribution of the
female population of Japan. The technique of iterative proportional fitting of
a contingency table to given marginals as described by BISHOP, FIENBERG and
HOLLAND (1) or by HARTUNG (3) was used. This technique keeps the risks
constant as observed in every cell and changes the marginals and the cell counts
according to the given age distribution of the population. Iterative
proportional fitting of contingency tables to given marginals is a well known
technique in multivariate statistics and can be applied here without changing
the observed interrelations between smoking habit, occupation and lung cancer.
From the fitted or adjusted tables the risk ratios are calculated in the usual
way.

Such risk ratios, based on data with the age selection bias removed, are the
correct ones and should be used. One has to require that there should be no
selection bias by age and that the cases should be included as they would have
occurred in the population. Otherwise statistical tests and P-values are not
very meaningful.
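
The fitting step can be sketched as follows. This is a minimal illustration
with NumPy, not the program used for the paper: the function name `ipf`, the
tolerance and the iteration limit are our own choices. The seed table holds the
women under risk from table 2; the row targets are the age totals printed in
table 3 (39, 30, 19 and 12 percent of 91540), and the column targets are the
unchanged smoking-habit totals. Adjusted case counts then follow from the cell
risks, which are kept as observed.

```python
import numpy as np

# Rows: age 40-49, 50-59, 60-69, 70+.
# Columns: husband non-smoker, 1-19, 20+ cigarettes per day (Table 2).
at_risk = np.array([
    [7918.0, 17492.0, 12615.0],
    [7635.0, 15640.0, 8814.0],
    [6170.0, 10381.0, 3793.0],
    [172.0, 671.0, 239.0],
])
cases = np.array([
    [4.0, 21.0, 21.0],
    [14.0, 46.0, 31.0],
    [16.0, 31.0, 10.0],
    [3.0, 1.0, 2.0],
])

# Target marginals: age totals as printed in Table 3, and the original
# column totals, which the adjustment leaves unchanged.
row_targets = np.array([35700.6, 27462.0, 17392.6, 10984.8])
col_targets = at_risk.sum(axis=0)


def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale rows and columns in turn until both marginals match."""
    fitted = seed.astype(float).copy()
    for _ in range(max_iter):
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        if np.allclose(fitted.sum(axis=1), row_targets, rtol=tol, atol=0.0):
            break
    return fitted


fitted_at_risk = ipf(at_risk, row_targets, col_targets)

# Cell risks stay as observed: adjusted cases = observed risk * fitted count.
fitted_cases = cases / at_risk * fitted_at_risk

print(np.round(fitted_at_risk, 1))
print(round(float(fitted_cases.sum()), 2))
```

Because the row and column scalings only multiply whole rows and columns, the
odds ratios between cells of the seed table are preserved, which is the sense
in which the observed interrelations remain unchanged.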


                          HUSBAND'S SMOKING HABITS
WIFE'S AGE      NON            1-19           20+            TOTAL

40-49          4 /  7918     21 / 17492     21 / 12615     46 / 38025
50-59         14 /  7635     46 / 15640     31 /  8814     91 / 32089
60-69         16 /  6170     31 / 10381     10 /  3793     57 / 20344
70+            3 /   172      1 /   671      2 /   239      6 /  1082

TOTAL         37 / 21895     99 / 44184     64 / 25461    200 / 91540

(Each cell: lung cancer cases / women under risk)

TABLE 2: SMOKING HABIT OF HUSBAND BY AGE OF WIFE. ORIGINAL DATA
(Table 2 of HIRAYAMA 1984).


Table 2 shows the original data by age of wife. The cells contain the number of
lung cancer cases and those under risk as published by HIRAYAMA. The 1-19 group
includes ex-smokers in this and the following tables. 200 cases out of 91540
women were observed. Iterative proportional fitting to the female age
distribution of the population leaves the underlined numbers constant. The
others are adjusted using a right-hand marginal which is made proportional to
the age distribution of the population.


Table 3 gives the results of iterative proportional fitting to the female age
distribution of the population. It contains the numbers of those under risk
and of lung cancer deaths as they would have been observed if HIRAYAMA had
not excluded or preferred certain age groups. The age selection bias is
removed. The risks in the individual cells are still the same as those observed
by HIRAYAMA. Also the structure of the common distribution regarding age,
smoking habit and lung cancer is unchanged. HIRAYAMA would have observed a total
of 232 cases instead of 200, with the corresponding numbers in the individual
cells, had he included all women as they live in the population. This table
is the best available starting point for age-adjusted risk ratio calculations.
It has not been used so far.

                            HUSBAND'S SMOKING HABITS
AGE          NON                1-19               20+                TOTAL

40-49       3.91 /  7748.8    19.12 / 15927.8    20.02 / 12024.0    43.05 / 35700.6
50-59      12.49 /  6813.7    38.20 / 12987.1    26.95 /  7661.2    77.64 / 27462.0
60-69      14.25 /  5496.6    25.70 /  8604.9     8.68 /  3291.1    48.63 / 17392.6
70+        32.02 /  1835.9     9.93 /  6664.2    20.79 /  2484.7    62.74 / 10984.8

TOTAL      62.67 / 21895      92.95 / 44184      76.44 / 25461     232.06 / 91540

(Each cell: lung cancer cases / women under risk)

TABLE 3: SMOKING HABIT OF HUSBAND BY AGE OF WIFE (Table 2
of HIRAYAMA 1984). Selection bias removed: data adjusted
to the age distribution of women in the population.


                          HUSBAND'S SMOKING HABITS
                          NON        1-19       20+

UPPER PART
  RR                      1.00       1.37       1.56
  L90                                1.00       [illegible]
  MH-CHI                             1.51       2.27
  P (ONE TAILED)                     .065       .012*

LOWER PART
  RR                      1.00        .77       1.06
  L90                                 .59        .80
  MH-CHI                             2.19        .27
  P (ONE TAILED)                     .014**     .395

UPPER PART: STANDARDIZED BY AGE OF WOMEN ONLY
LOWER PART: AGE SELECTION BIAS REMOVED AND STANDARDIZED
            BY AGE OF WOMEN

RR:  Weighted point estimate of rate ratio
L90: Lower limit of the 90 percent confidence interval
*  = "significant" in positive direction
** = "significant" in negative direction

TABLE 4: RELATIVE RISK BY AGE OF WOMEN (Calculated from
table 2 of HIRAYAMA 1984)
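
The one-tailed P-values in table 4 can be related to the MH-CHI values by a
short calculation. The sketch below assumes that MH-CHI is read as a standard
normal deviate, which is our reading of the table and not a statement taken
from HIRAYAMA; the function name is our own. Under that assumption the printed
P-values are reproduced up to rounding.

```python
import math


def one_tailed_p(mh_chi):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(mh_chi / math.sqrt(2.0))


# MH-CHI values of Table 4 (upper part: 1.51, 2.27; lower part: 2.19, .27).
for chi in (1.51, 2.27, 2.19, 0.27):
    print(f"MH-CHI {chi:4.2f} -> one-tailed P {one_tailed_p(chi):.3f}")
```

In the lower part the deviate for the 1-19 group points in the negative
direction, so the small P-value there marks a risk decrease.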

In the upper part of table 4 one finds the risk ratios standardized by age
only, as reported by HIRAYAMA. The lower part contains the risk ratios after
removing the age selection bias. In the upper part the weighted point
estimate of the rate ratio is 1.56 in the 20+ group and is technically
"significant". L90 designates the lower limit of the 90-percent confidence
interval in this and the following tables, as it was used by HIRAYAMA.

The risk increase disappears completely when one removes the selection bias by
age. In the 20+ group the rate ratio is 1.06, hardly a relevant risk increase.
In the group of 1-19 cigarettes per day it is .77, which is a technically
significant risk decrease. The adjusted rate ratio, considering all those
exposed in one group versus those not exposed, is .81 with a confidence interval
including unity. If HIRAYAMA had observed the cases as they occur in the
population without selection bias by age, he would have observed no risk
increase, but a slight and meaningless risk decrease. This is the main result of
our reanalysis, which corresponds well with the result of the prospective
American cohort study as published by GARFINKEL (2).
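
As a rough plausibility check, crude risk ratios can be formed directly from
the column totals of table 3. This is only a sketch with names of our own
choosing: it pools the age groups without the stratified weighting behind the
estimates in table 4, so the crude values come out close to, but not identical
with, the weighted rate ratios .77, 1.06 and .81.

```python
# Column totals of Table 3: age-adjusted cases and women under risk.
adjusted_cases = {"non": 62.67, "1-19": 92.95, "20+": 76.44}
at_risk = {"non": 21895, "1-19": 44184, "20+": 25461}


def risk(group):
    """Crude risk in one smoking-habit group."""
    return adjusted_cases[group] / at_risk[group]


crude_rr = {group: risk(group) / risk("non") for group in ("1-19", "20+")}

# All exposed women pooled into one group versus the unexposed.
exposed_risk = ((adjusted_cases["1-19"] + adjusted_cases["20+"])
                / (at_risk["1-19"] + at_risk["20+"]))
crude_rr_exposed = exposed_risk / risk("non")

print({group: round(value, 2) for group, value in crude_rr.items()})
print(round(crude_rr_exposed, 2))
```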

In the discussion following our paper in Tokyo last November HIRAYAMA noted
that in the population the percentage of women over 70 married to men who are
still alive is smaller than the percentage of women reported in the population
statistics. Since we do not have the numbers, we assume that only half of the
women over 70 reported in the population census 1965 were married to living
husbands. The resulting hypothetical population together with the HIRAYAMA
cohort is presented in table 5.
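
One way to arrive at such a hypothetical distribution is sketched below. It
rests on assumptions of ours: the population shares of 39, 30, 19 and 12
percent are the ones implied by the row totals of table 3, and the halved 70+
share is simply renormalized together with the other age groups so that the
shares again add up to 100 percent.

```python
# Female age distribution over 40 (percent), as implied by the row totals
# of Table 3. The 12 percent for women over 70 is quoted in the text.
population_pct = {"40-49": 39.0, "50-59": 30.0, "60-69": 19.0, "70+": 12.0}

# Assumption: only half of the women over 70 have a living husband.
halved = dict(population_pct)
halved["70+"] = population_pct["70+"] / 2.0

# Renormalize so that the hypothetical shares sum to 100 percent.
total = sum(halved.values())
hypothetical_pct = {age: 100.0 * share / total for age, share in halved.items()}

for age, share in hypothetical_pct.items():
    print(f"{age:6s} {share:5.1f} %")
```

Rounded to whole percent, the last three age groups give the 32, 20 and 6
percent shown for the hypothetical population.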


AGE GROUP        PERCENT FEMALE
                 HYPOTHETIC POPULATION        HIRAYAMA COHORT

40-49                    42                        42
50-59                    32                        35
60-69                    20                        22
70+                       6                         1

TOTAL                   100                       100


TABLE 5: DIFFERENCES BETWEEN THE HIRAYAMA COHORT AND A HYPOTHETIC
FEMALE AGE DISTRIBUTION OVER 40. (Explanation see text)


There is still possibly a selection bias in table 5. Now 6 percent of women
over 70 would have been included in the hypothetic female distribution
instead of 12 percent. The corresponding lung cancer cases, which generally are
more frequent in this age group than in younger women, would have been excluded.
The reduction to one half sufficiently accounts for the argument of HIRAYAMA
mentioned above. The resulting relative risks are presented in table 6. Even
with these assumptions the relative risk is only 1.03 in the group of women
married to husbands smoking 1-19 cigarettes per day, 1.29 in the 20+ group and
1.12 if one considers the smoking group altogether. None of these risk ratios is
statistically different from unity.

                    HUSBAND'S SMOKING HABITS
                    NON       1-19      20+       SMOKER

RR                  1.00      1.03      1.29      1.12
L90                            .77       .94       .65
MH-CHI                         .05      1.33       .47


TABLE 6: RELATIVE RISK BY AGE OF WOMEN, AGE SELECTION BIAS REMOVED
AND HYPOTHETICALLY ADJUSTED TO HIRAYAMA'S ARGUMENT
(see table 5)

Since it is impossible for us to reconstruct the real situation some twenty
years ago in Japan regarding the conditional distributions of males and
females by age, smoking and family status, the reported results of the
HIRAYAMA study cannot be conclusive to us. As long as the selection bias by
age is not explained numerically in a sufficient way by HIRAYAMA, his
thesis that there is a significant and relevant risk increase based on his
data might as well be wrong.

We now consider two occupations, farmers and industry workers. From the upper
part of table 7 one can see that the relative risk for wives of farmers seems
substantial when one standardizes by age of men only. The point estimates of
the rate ratios are 1.48 and 1.63 respectively. This was observed earlier and
had no adequate explanation. If one removes the selection bias by age and
adjusts to the male age distribution of Japan - the numbers in the lower part
of table 7 - the rate ratios are .85 and .82, not different from unity. This
seems more plausible.


                          HUSBAND'S SMOKING HABITS
                          NON        1-19       20+

UPPER PART
  RR                      1.00       1.48       1.63
  L90                                 .97       1.01
  MH-CHI                             1.48       1.92
  P (ONE TAILED)                     .069       .027

LOWER PART
  RR                      1.00        .85        .82
  L90                                 .59        .53
  MH-CHI                              .42        .53
  P (ONE TAILED)                     .337       .296

UPPER PART: STANDARDIZED BY AGE OF MEN ONLY
LOWER PART: AGE SELECTION BIAS REMOVED AND STANDARDIZED
            BY AGE OF MEN

TABLE 7: RELATIVE RISKS: WIVES OF FARMERS ONLY
(Table 3 of HIRAYAMA 1984)

Considering the wives of industry workers only, in the upper part of table 8
the point estimates of the rate ratios are 1.77 and 2.27, standardized by age
of men, and are not significant. Removing the age selection bias - in the
lower part of table 8 - there is a remarkable risk increase to 4.60 and 6.90,
which is significant. However, there are only 9 lung cancer deaths in the 20+
group and only 3 in women 70 years and older. These are small numbers, but
they are the numbers observed and used by HIRAYAMA, and his risk structure is
unchanged. Thus only in the subgroup of women married to industry workers is
there a risk increase; in all other occupations there is no risk increase.
Omitting industry workers, the point estimates of the rate ratios are .90 and
.89, not significantly different from unity. These findings are consistent with
the assumption of confounding factors in women married to industry workers, who
might be exposed to other environmental hazards. Our calculations show that by
removing the selection bias by age one can explain hitherto implausible
results.

                          HUSBAND'S SMOKING HABITS
                          NON        1-19       20+

UPPER PART
  RR                      1.00       1.77       2.27
  L90                                 .70        .84
  MH-CHI                              .73        .81
  P (ONE TAILED)                     .232       .208

LOWER PART
  RR                      1.00       4.60       6.90
  L90                                1.71       2.45
  MH-CHI                             2.50       2.78
  P (ONE TAILED)                     .006       .003

UPPER PART: STANDARDIZED BY AGE OF MEN ONLY
LOWER PART: AGE SELECTION BIAS REMOVED AND STANDARDIZED
            BY AGE OF MEN

TABLE 8: RELATIVE RISKS: WIVES OF INDUSTRY WORKERS ONLY
(Table 3 of HIRAYAMA 1984)


Active smoking is correlated among married couples. In a society in which
female smokers were very rare in 1965, more women married to smokers will
declare themselves nonsmokers than the other way round. One has therefore to
consider biased or differential misclassification. There are likely more women
with lung cancer who have been misclassified as nonsmokers than the other way
round. They have to be removed from the cohort. We made some moderate
assumptions regarding misclassification, as shown in table 9. In order to
examine how sensitive the relative risk is, we removed 10, 20 and 30 cases from
the exposed groups, corresponding to 5, 10 and 15 percent. Assuming 30
misclassified cases - 15 percent, a percentage which has been observed in the
literature (5) - the rate ratios are .66 and .85. In the group 1-19 cigarettes
per day all the risk estimators are significantly smaller than unity. Our
personal opinion is that 10 differentially misclassified cases out of 200 is a
fair number. The corresponding weighted point estimates of the rate ratio are
.74 and 1.00. These risk estimates are as reasonable as other risk estimates
calculated from the HIRAYAMA data. They indicate - if anything - a risk
decrease, not a risk increase.

NUMBER OF CASES ASSUMED                     HUSBAND'S SMOKING HABITS
MISCLASSIFIED AND REMOVED
FROM EXPOSED GROUPS                         NON       1-19      20+

n = 10 = 5 %        RR                      1.00       .74      1.00
                    P (ONE TAILED)                    .006      .469

n = 20 = 10 %       RR                      1.00       .70      [illegible]
                    P (ONE TAILED)                    .003      .383

n = 30 = 15 %       RR                      1.00       .65       .85
                    P (ONE TAILED)                    .001      .238


TABLE 9: RELATIVE RISK: ASSUMED DIFFERENTIAL MISCLASSIFICATION
(Age selection bias removed and standardized by age of women)
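
The direction and rough size of this sensitivity can be illustrated with crude
ratios. The sketch below is not the weighted calculation behind table 9. It
assumes, for illustration only, that the removed cases are taken
proportionally from both exposed groups, expressed as a fraction of the
99 + 64 = 163 exposed cases originally observed, and it uses the age-adjusted
column totals of table 3. All names are our own.

```python
# Column totals of Table 3: age-adjusted cases and women under risk.
adjusted_cases = {"non": 62.67, "1-19": 92.95, "20+": 76.44}
at_risk = {"non": 21895, "1-19": 44184, "20+": 25461}

# Exposed cases originally observed (Table 2: 99 + 64).
EXPOSED_CASES_OBSERVED = 99 + 64


def crude_rr_after_removal(n_removed):
    """Crude risk ratios after removing n_removed cases from the exposed.

    Assumption: the removal shrinks the case counts of both exposed groups
    by the same factor.
    """
    keep = 1.0 - n_removed / EXPOSED_CASES_OBSERVED
    baseline = adjusted_cases["non"] / at_risk["non"]
    return {
        group: keep * adjusted_cases[group] / at_risk[group] / baseline
        for group in ("1-19", "20+")
    }


for n in (10, 20, 30):
    ratios = crude_rr_after_removal(n)
    print(n, {group: round(value, 2) for group, value in ratios.items()})
```

Under these assumptions the crude ratios fall below the unadjusted ones with
every additional case removed, in line with the pattern of table 9; the exact
values differ because no stratified weighting is applied.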


DISCUSSION 


Reanalyses of data which have been collected by others are not easy. This is
because information is not completely available, because information might be
misinterpreted, or because one has to take another view in order to come closer
to the acceptable truth. Our calculations do not diminish the great value and
impact the HIRAYAMA study had on the epidemiology of passive smoking. They
show, however, that reasonable alternative views on the same data are possible,
which lead to opposite conclusions. Our findings are in contrast to HIRAYAMA's
thesis that - based on his data - there is a substantial statistical association
between passive smoking and lung cancer. We do not hold that our view is
the only correct one. We do hold, however, that the risk ratios calculated by
us, removing age selection bias, are as reasonable as the ones calculated by
HIRAYAMA. Since they go back to the population and not to a selected sample,
our estimates could be preferable. Hypothetically accounting for the argument
of HIRAYAMA that in the population the percentage of women over 70 married
to men who are still alive is smaller than the percentage of women reported
in the population statistics does not change our results. Our risk estimates
are a consequence of the data published by HIRAYAMA and cannot be rejected
from the study data as they are published so far.

                                              HUSBAND'S SMOKING HABITS
                                              NON       1-19      20+

AGE SELECTION BIAS        RR                  1.00       .77      1.06
REMOVED AND AGE-          P (ONE TAILED)                .014      .395
STANDARDIZED (WOMEN)

WITHOUT INDUSTRY          RR                  1.00       .90       .89
WORKERS, AGE SELECTION    P (ONE TAILED)                .394      .179
BIAS REMOVED AND AGE-
STANDARDIZED (MEN)

10 CASES ASSUMED          RR                  1.00       .74      1.00
MISCLASSIFIED, AGE        P (ONE TAILED)                .006      .469
SELECTION BIAS
REMOVED AND AGE-
STANDARDIZED (WOMEN)


TABLE 10: REANALYSIS OF HIRAYAMA'S DATA: SUMMARY OF RELATIVE RISKS

To summarize: Removing the age selection bias in the HIRAYAMA study, one gets a
relative risk of 1.06 in the group of women married to men smoking 20 or more
cigarettes per day. In the group of women married to men smoking 1-19 cigarettes
per day the relative risk is .77, a technically "significant" risk decrease. If
HIRAYAMA could have observed the lung cancer cases as they occur in the female
population, he would have observed no risk increase, but a risk decrease to
around .81, not significantly different from unity, considering those exposed
versus those not exposed.

If one omits the wives married to industry workers because of possible
confounding factors, the relative risk is .90 and .89 respectively. This is of
the same order of magnitude and smaller than unity. Here we could adjust and
standardize by occupation and age of men only, which is not as appropriate as
by age of women.

If one assumes that 10 cases are differentially misclassified and removes them
from the exposed groups, the risk estimates are .74 and 1.00 respectively.

Our findings demonstrate how sensitive the data of this study are and how weak
the evidence for a statistical association between passive smoking and lung
cancer from this study is. In view of these and other facts, which we mentioned
in the introduction, the null hypothesis might be true as well and is consistent
with the HIRAYAMA data in the same way as the alternative hypothesis.

We would be glad to apply our technique to more detailed data if we can get
them from HIRAYAMA, for instance in order to adjust by occupation of men and
age of women, or by occupation of men and by age of women married to a
husband who is still alive. We are ready to modify our view if such data can
support the alternative hypothesis better than the published data do. We do
hope that our calculations give rise to a fruitful discussion. The methods we
used here might be of interest to the analysis of other cohort and control
studies.